Paragraph Specific N-Gram Approaches to Automatically Assessing Essay Quality

Authors

  • Scott A. Crossley
  • Caleb Defore
  • Kris Kyle
  • Jianmin Dai
  • Danielle S. McNamara
Abstract

In this paper, we describe an n-gram approach to automatically assess essay quality in student writing. Underlying this approach is the development of n-gram indices that examine rhetorical, syntactic, grammatical, and cohesion features of paragraph types (introduction, body, and conclusion paragraphs) and entire essays. For this study, we developed over 300 n-gram indices and assessed their potential to predict human ratings of essay quality. A combination of these n-gram indices explained over 30% of the variance in human ratings for essays in a training and testing corpus. The findings from this study indicate the strength of using n-gram indices to automatically assess writing quality. Such indices not only explain text-based factors that influence human judgments of essay quality, but also provide new methods for automatically assessing writing quality.
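As an illustration only, the idea of a paragraph-specific n-gram index can be sketched as follows. This is not the authors' implementation: the positional labelling rule (first paragraph is the introduction, last is the conclusion), the whitespace tokenizer, the reference n-gram set, and the function names `label_paragraphs` and `ngram_index` are all assumptions made for the sketch. The index here is simply the proportion of a paragraph type's n-grams that occur in a reference set.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def label_paragraphs(paragraphs):
    """Group paragraphs by type using position only (an assumption):
    first = introduction, last = conclusion, everything else = body."""
    labeled = {"introduction": [], "body": [], "conclusion": []}
    last = len(paragraphs) - 1
    for i, paragraph in enumerate(paragraphs):
        if i == 0:
            labeled["introduction"].append(paragraph)
        elif i == last:
            labeled["conclusion"].append(paragraph)
        else:
            labeled["body"].append(paragraph)
    return labeled


def ngram_index(paragraphs, reference_ngrams, n=2):
    """Proportion of n-grams in the given paragraphs that appear in a
    reference set (a hypothetical index, chosen for illustration)."""
    grams = []
    for paragraph in paragraphs:
        # Naive lowercase whitespace tokenization; n-grams never span paragraphs.
        grams.extend(ngrams(paragraph.lower().split(), n))
    if not grams:
        return 0.0
    hits = sum(1 for gram in grams if gram in reference_ngrams)
    return hits / len(grams)


# Toy essay and a made-up reference set of bigrams.
essay = [
    "In this essay I will argue",
    "For example the data show",
    "In conclusion I will argue",
]
reference = {("in", "conclusion"), ("i", "will")}

by_type = label_paragraphs(essay)
indices = {ptype: ngram_index(pars, reference) for ptype, pars in by_type.items()}
```

In a study of this kind, many such indices (one per n-gram list and paragraph type) would serve as predictors in a statistical model of human essay ratings; that modelling step is not shown here.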

Similar resources

Automated Assessment of Paragraph Quality: Introduction, Body, and Conclusion Paragraphs

Natural language processing and statistical methods were used to identify linguistic features associated with the quality of student-generated paragraphs. Linguistic features were assessed using Coh-Metrix. The resulting computational models demonstrated small to medium effect sizes for predicting paragraph quality: introduction quality r2 = .25, body quality r2 = .10, and conclusion quality r2...

Syntagmatic, Paradigmatic, and Automatic N-Gram Approaches to Assessing Essay Quality

Computational indices related to n-gram production were developed in order to assess the potential for n-gram indices to predict human scores of essay quality. A regression analysis was conducted on a corpus of 313 argumentative essays. The analyses demonstrated that a variety of n-gram indices were highly correlated to essay quality, but were also highly correlated to the number of words in th...

Incorporating learning characteristics into automatic essay scoring models: What individual differences and linguistic features tell us about writing quality

This study investigates a novel approach to automatically assessing essay quality that combines natural language processing approaches that assess text features with approaches that assess individual differences in writers such as demographic information, standardized test scores, and survey results. The results demonstrate that combining text features and individual differences increases the a...

Investigating neural architectures for short answer scoring

Neural approaches to automated essay scoring have recently shown state-of-the-art performance. The automated essay scoring task typically involves a broad notion of writing quality that encompasses content, grammar, organization, and conventions. This differs from the short answer content scoring task, which focuses on content accuracy. The inputs to neural essay scoring models – ngrams and embe...

Paragraph vector based topic model for language model adaptation

Topic model is an important approach for language model (LM) adaptation and has attracted research interest for a long time. Latent Dirichlet Allocation (LDA), which assumes generative Dirichlet distribution with bag-of-word features for hidden topics, has been widely used as the state-of-the-art topic model. Inspired by recent development of a new paradigm of distributed paragraph representati...

Publication date: 2013